Intra-process progress exchange medium by frankmcsherry · Pull Request #807 · TimelyDataflow/timely-dataflow

frankmcsherry · 2026-06-14T01:14:31Z

An experimental, but not yet better, progress tracking medium for multiple worker threads. The design is a concurrent data structure that supports more in-place aggregation, and it favors laggards by having the folks committing updates consolidate as they go, leaving behind less of a mess than MPSCs do.

…ll-reduce

A Chain<T> is a single-writer, multi-reader chain of atoms intended to eventually replace the Progcaster's intra-process leg; a Mesh<T> bundles one chain per writer, with readers that sweep all chains. Atoms merge through T: Chainable, a commutative monoid, and may be compacted (merged with adjacent atoms, never split) before a reader folds them. Each reader folds every atom sent after its registration exactly once; live state is bounded by O(#readers) independent of the send count.

Design highlights:
- Forward links; node payload (value, next) behind one RwLock so walkers snapshot consistently and compaction mutates atomically; per-node atomic holders counts with RAII pins.
- Writer fast path merges in place into an unpinned newest node (zero allocation in the common case), re-checking holders under the payload write lock to exclude concurrent pinning.
- Readers walk pin..=newest hand-over-hand, folding forward; recv_with hands individual atoms to the caller.
- Compaction absorbs a successor only when both nodes are unpinned and the successor is not newest; bypassing pinned nodes is unsound (their frozen next pointers are side entrances that later absorptions would move values behind). Boundedness instead comes from an oldest pointer and a full sweep at every recv and reader drop; the prefix behind all pins is reclaimed by Arc refcounts.
- Lock order: oldest mutex, then newest mutex, then payload locks older before newer.

Tests cover single/multiple/late readers, repeated and empty recvs, laggard compaction bounds, reader-drop healing, the in-place fast path (chain length exactly 2 after catch-up plus N sends), mesh delivery, and a randomized multi-threaded stress test asserting totals and per-chain length <= #readers + 2 at quiescent checkpoints.

See chain-design-notes.md for the contract, the compaction safety argument, and deviations from the original sketch.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
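To make the "commutative monoid" requirement concrete, here is a minimal sketch of what an atom type might look like. The `Monoid` trait, the `Updates` type, and `fold_all` are hypothetical stand-ins invented for illustration; the PR text does not show the actual methods of `Chainable`. The point is only that merging is order-insensitive, so compacting adjacent atoms before a reader folds them cannot change the reader's total.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the PR's `Chainable` bound (real methods not shown
/// in the PR text): a commutative monoid with `Default` as the identity.
pub trait Monoid: Default {
    /// Fold `other` into `self`; must be associative and commutative.
    fn merge(&mut self, other: Self);
}

/// Net count per timestamp. Entries that reach zero are removed, so updates
/// that cancel leave no state behind.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Updates(pub BTreeMap<u64, i64>);

impl Updates {
    /// A batch holding one (time, diff) update; a zero diff is the empty batch.
    pub fn single(time: u64, diff: i64) -> Self {
        let mut map = BTreeMap::new();
        if diff != 0 {
            map.insert(time, diff);
        }
        Updates(map)
    }
}

impl Monoid for Updates {
    fn merge(&mut self, other: Self) {
        for (time, diff) in other.0 {
            let sum = self.0.get(&time).copied().unwrap_or(0) + diff;
            if sum == 0 {
                self.0.remove(&time);
            } else {
                self.0.insert(time, sum);
            }
        }
    }
}

/// What a reader does with the atoms it walks: fold them into one value.
pub fn fold_all<T: Monoid>(atoms: impl IntoIterator<Item = T>) -> T {
    let mut acc = T::default();
    for atom in atoms {
        acc.merge(atom);
    }
    acc
}
```

Under these assumptions, folding three atoms one at a time and folding them after two of them have been pre-merged yield the same `Updates`, which is the property compaction relies on.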

Per-writer chains fail the primary goal of cross-writer cancellation: two workers sending {(T,+1)} and {(T,-1)} at distinct increasing T build internally-incompressible per-chain content whose union cancels to nothing, leaving laggards O(elapsed) fold work. A single shared chain merges concurrent sends at the head, so accumulated nodes hold the net.

- Chain<T> is now a cloneable multi-writer handle; send(&self, value) locks the newest mutex (serializing writers), then the newest node's payload write lock (consistent with the documented lock ordering), and merges in place when holders == 0, allocating only when a reader has just caught up and pinned the head.
- Mesh<T> retained as a per-writer comparison structure, documented as benchmark-only with its known cancellation pathology.
- Tests: suite adapted to the new API; added multi-writer tests (sequential two-handle, cross-writer cancellation state bound, concurrent writers with one and many readers); stress test now shares one chain across writer threads. 15 chain tests pass; clippy clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
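The cancellation pathology can be reproduced with plain maps, independent of any chain machinery. The sketch below is an illustration only: `Batch`, `merge`, `per_writer_entries`, and `shared_entries` are hypothetical names, and a `BTreeMap` stands in for whatever accumulated state the chain nodes hold. One writer contributes (t, +1) and another (t, -1) at each increasing t.

```rust
use std::collections::BTreeMap;

/// Accumulated (time -> net diff) state; a stand-in for chain node content.
type Batch = BTreeMap<u64, i64>;

/// Add `diff` at `time`, dropping the entry if the net becomes zero.
fn merge(into: &mut Batch, time: u64, diff: i64) {
    let sum = into.get(&time).copied().unwrap_or(0) + diff;
    if sum == 0 {
        into.remove(&time);
    } else {
        into.insert(time, sum);
    }
}

/// Entries retained when each writer accumulates only its own sends:
/// neither batch can compress, although their union is empty.
pub fn per_writer_entries(steps: u64) -> usize {
    let (mut plus, mut minus) = (Batch::new(), Batch::new());
    for t in 0..steps {
        merge(&mut plus, t, 1);
        merge(&mut minus, t, -1);
    }
    plus.len() + minus.len()
}

/// Entries retained when both writers merge into one shared accumulation:
/// each +1 meets its -1, so only the net (nothing) remains.
pub fn shared_entries(steps: u64) -> usize {
    let mut shared = Batch::new();
    for t in 0..steps {
        merge(&mut shared, t, 1);
        merge(&mut shared, t, -1);
    }
    shared.len()
}
```

With 1000 steps the per-writer layout retains 2000 entries that a laggard must fold, while the shared layout retains none.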

Three scenarios (all keep up; laggard; unread backlog) across N workers and cancellation fractions, with results and analysis in chain-bench-results.md. Headline: the chain bounds backlog state and laggard work by the live cancellation window (orders of magnitude below the MPSC baseline, flat in N), and pays 2-7x on the tight-loop send path where the head mutex contends.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

A per-send version counter enables cheap change detection: recv returns without taking any chain lock when no atom has been committed since the reader last caught up, and the compaction sweep runs only after a productive walk (when pins actually moved). This keeps spinning readers off the chain's shared locks.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
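A minimal sketch of the version-counter idea, assuming a much simpler structure than the real chain: a `Mutex<Vec<i64>>` log stands in for the chain, and `Shared`, `Reader`, `send`, and `recv` here are hypothetical names, not the PR's API. The writer publishes the version while still holding the lock, and the reader compares it against the version it last caught up to before touching any lock.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Stand-in for the chain: a locked log plus a version counter.
pub struct Shared {
    /// Number of committed sends; written only while `log` is locked.
    pub version: AtomicU64,
    pub log: Mutex<Vec<i64>>,
}

pub struct Reader {
    shared: Arc<Shared>,
    /// Version (equal to log length) at which this reader last caught up.
    seen: u64,
}

impl Shared {
    pub fn new() -> Arc<Self> {
        Arc::new(Shared { version: AtomicU64::new(0), log: Mutex::new(Vec::new()) })
    }

    pub fn send(&self, value: i64) {
        let mut log = self.log.lock().unwrap();
        log.push(value);
        // Publish under the lock so the version always equals the log length.
        self.version.store(log.len() as u64, Ordering::Release);
    }
}

impl Reader {
    /// Register a reader that will see only sends committed after this call.
    pub fn new(shared: &Arc<Shared>) -> Self {
        let seen = shared.log.lock().unwrap().len() as u64;
        Reader { shared: Arc::clone(shared), seen }
    }

    /// Returns the sum of values sent since the last call, or `None` if
    /// nothing was committed. The `None` path takes no lock.
    pub fn recv(&mut self) -> Option<i64> {
        if self.shared.version.load(Ordering::Acquire) == self.seen {
            return None; // fast path: no new sends, stay off the shared lock
        }
        let log = self.shared.log.lock().unwrap();
        let sum: i64 = log[self.seen as usize..].iter().sum();
        self.seen = log.len() as u64;
        Some(sum)
    }
}
```

In this sketch a reader that spins on `recv` while nothing is being sent performs only an atomic load per call, which is the behavior the commit describes for spinning readers.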

…ing-key axis

Two more points on the merge-eagerness spectrum (ledger: total merge via a shared consolidated map; cells: per-reader in-place accumulators) and a key-type axis (u64 vs Box<[u64;3]>) modeling allocating timestamps, whose clone/drop traffic is the scaling pathology the chain targets.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Synthetic three-scenario sweep plus an allocating-key matrix and a record of real-workload measurements from a separate experimental Progcaster wiring (not in this branch). Bottom line: the chain wins laggard work and backlog memory decisively, loses single-socket throughput, and allocation narrows but does not close that gap at 10 cores.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

frankmcsherry and others added 6 commits June 12, 2026 16:57

frankmcsherry wants to merge 6 commits into master from progress_chain
